Combinatorial Multi-armed Bandits for Real-Time Strategy Games
Authors
Abstract
Similar resources
Combinatorial Multi-armed Bandits for Real-Time Strategy Games
Games with large branching factors pose a significant challenge for game tree search algorithms. In this paper, we address this problem with a sampling strategy for Monte Carlo Tree Search (MCTS) algorithms called naïve sampling, based on a variant of the Multi-armed Bandit problem called Combinatorial Multi-armed Bandits (CMAB). We analyze the theoretical properties of several variants of naïve...
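The naïve sampling scheme summarized above can be sketched in code: each unit draws its action from its own "local" bandit, and completed action combinations (macro-arms) are ranked by a "global" bandit. All class and parameter names, and the epsilon-greedy policies used here, are illustrative assumptions, not the paper's exact algorithm.

```python
import random
from collections import defaultdict

class NaiveSampler:
    """Sketch of naive sampling for a CMAB (illustrative; names and the
    epsilon-greedy policies are assumptions, not the paper's exact
    algorithm). Each unit keeps a 'local' bandit over its own actions;
    a 'global' bandit ranks full action combinations (macro-arms)."""

    def __init__(self, legal_actions, eps_global=0.5, eps_local=0.5):
        self.legal_actions = legal_actions  # {unit: [legal actions]}
        self.eps_global, self.eps_local = eps_global, eps_local
        self.local = {u: defaultdict(float) for u in legal_actions}
        self.local_n = {u: defaultdict(int) for u in legal_actions}
        self.global_v = defaultdict(float)  # macro-arm -> mean reward
        self.global_n = defaultdict(int)

    def _pick_local(self, unit):
        # Local bandit for one unit: epsilon-greedy over its actions.
        actions = self.legal_actions[unit]
        if random.random() < self.eps_local:
            return random.choice(actions)
        return max(actions, key=lambda a: self.local[unit][a])

    def select(self):
        # Explore: assemble a fresh macro-arm from the local bandits.
        if random.random() < self.eps_global or not self.global_v:
            return tuple(self._pick_local(u) for u in self.legal_actions)
        # Exploit: best macro-arm tried so far, per the global bandit.
        return max(self.global_v, key=self.global_v.get)

    def update(self, macro_arm, reward):
        # Running-mean updates for the global and the local bandits.
        self.global_n[macro_arm] += 1
        self.global_v[macro_arm] += (
            reward - self.global_v[macro_arm]) / self.global_n[macro_arm]
        for unit, action in zip(self.legal_actions, macro_arm):
            self.local_n[unit][action] += 1
            self.local[unit][action] += (
                reward - self.local[unit][action]) / self.local_n[unit][action]
```

The point of the two-level structure is that the local bandits can propose combinations never tried before, while the global bandit concentrates reward estimates on combinations that have actually been sampled, avoiding the combinatorial blow-up of treating every macro-arm as an independent arm.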
The Combinatorial Multi-Armed Bandit Problem and Its Application to Real-Time Strategy Games
Game tree search in games with large branching factors is a notoriously hard problem. In this paper, we address this problem with a new sampling strategy for Monte Carlo Tree Search (MCTS) algorithms, called Naïve Sampling, based on a variant of the Multi-armed Bandit problem called the Combinatorial Multi-armed Bandit (CMAB) problem. We present a new MCTS algorithm based on Naïve Sampling call...
Combinatorial Multi-Armed Bandits with Filtered Feedback
Motivated by problems in search and detection we present a solution to a Combinatorial Multi-Armed Bandit (CMAB) problem with both heavy-tailed reward distributions and a new class of feedback, filtered semibandit feedback. In a CMAB problem an agent pulls a combination of arms from a set {1, ..., k} in each round, generating random outcomes from probability distributions associated with these ...
Combinatorial Pure Exploration of Multi-Armed Bandits
We study the combinatorial pure exploration (CPE) problem in the stochastic multi-armed bandit setting, where a learner explores a set of arms with the objective of identifying the optimal member of a decision class, which is a collection of subsets of arms with certain combinatorial structures such as size-K subsets, matchings, spanning trees or paths, etc. The CPE problem represents a rich cl...
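As a concrete (if naive) instance of the CPE setting above, the sketch below fixes the decision class to size-k subsets, explores the arms uniformly, and outputs the subset with the highest sum of empirical means. Function and parameter names, and the round-robin budget split, are assumptions for illustration, not the paper's algorithm.

```python
import random
from itertools import combinations

def cpe_best_subset(means, k, budget, seed=0):
    """Uniform-sampling baseline for combinatorial pure exploration
    (CPE), specialized to the size-k-subset decision class (an
    illustrative assumption). Pull the arms round-robin for `budget`
    rounds, then return the member of the decision class maximizing
    the sum of empirical means."""
    rng = random.Random(seed)
    K = len(means)
    sums, counts = [0.0] * K, [0] * K
    for t in range(budget):
        i = t % K                                  # round-robin exploration
        sums[i] += float(rng.random() < means[i])  # Bernoulli(means[i]) pull
        counts[i] += 1
    emp = [s / max(c, 1) for s, c in zip(sums, counts)]
    # Search the decision class exhaustively (fine for small K).
    return max(combinations(range(K), k),
               key=lambda subset: sum(emp[i] for i in subset))
```

For richer decision classes such as matchings or spanning trees, the final `max` over an explicit enumeration would be replaced by a combinatorial maximization oracle over the empirical means.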
Towards Distribution-Free Multi-Armed Bandits with Combinatorial Strategies
We consider the following linearly combinatorial multi-armed bandits (MABs) problem. In a discrete time system, there are K unknown random variables (RVs), i.e., arms, each evolving as an i.i.d. stochastic process over time. At each time slot, we select a set of N (N ≤ K) RVs, i.e., a strategy, subject to an arbitrary constraint. We then gain a reward that is a linear combination of observations ...
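The reward structure described in this snippet, a linear combination of the observations of the N selected arms, can be illustrated as follows (the function name, the Bernoulli outcome model, and the unit default weights are hypothetical choices for the sketch):

```python
import random

def combinatorial_round(chosen, means, weights=None):
    """One round of the linearly combinatorial MAB sketched above
    (illustrative names and distributions, not the paper's notation).
    `chosen` is the selected strategy, a set of N of the K arms; each
    chosen arm i yields an i.i.d. Bernoulli(means[i]) outcome, and the
    round's reward is a linear combination of those observations."""
    if weights is None:
        weights = {i: 1.0 for i in chosen}  # plain sum by default
    outcomes = {i: float(random.random() < means[i]) for i in chosen}
    reward = sum(weights[i] * outcomes[i] for i in chosen)
    return reward, outcomes
```

A learner in this setting repeatedly calls such a round with different strategies and uses the per-arm observations (semi-bandit feedback) to estimate the unknown means.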
Journal
Journal title: Journal of Artificial Intelligence Research
Year: 2017
ISSN: 1076-9757
DOI: 10.1613/jair.5398